ABSTRACT

This paper reviews the different choices available to structure spike
trains using deterministic metrics. Temporal constraints observed in
biological or computational spike trains are first taken into account.
The relation with existing neural codes (rate coding, rank coding,
phase coding, ...) is then discussed.

To what extent the "neural code" contained in spike trains is related
to a metric appears to be a key point, and a generalization of the
Victor-Purpura metric family is proposed for temporally constrained
causal spike trains.
1 Introduction: spike trains in real life

The output of a neural network is a set of events, defined by their
occurrence times, up to some precision:

$T = \{\cdots, t_i^n, \cdots\}, \quad t_i^1 < t_i^2 < \cdots < t_i^n < \cdots$

where $t_i^n$ corresponds to the $n$th spike time of the neuron of
index $i$. Such a spike train writes
$\rho_i(t) = \sum_{t_i^n \in T_i} \delta(t - t_i^n)$, with related
inter-spike intervals $d_i^n = t_i^n - t_i^{n-1}$, using the Dirac
symbol $\delta(.)$. See e.g. [1] [2] [3] for an introduction.

In computational or biological contexts, not all multi-time sequences
correspond to spike trains, since spike trains are constrained by the
neural dynamics, and temporal constraints are to be taken into
account. Spike times are:

- [C1] bounded by a refractory period $r$, $r < d_i^{n+1}$,

- [C2] defined up to some absolute precision $\delta t$, while

- [C3] there is always a minimal delay $dt$ for one spike to be able to
interact with another, and

- [C4] there is a maximal inter-spike interval $D$ such that either
$d_i^{n+1} < D$ or $t_i^{n+1} = +\infty$ (i.e. either a neuron fires
within a time delay $< D$ or it remains quiescent forever).

For biological neurons, typically, in milliseconds: 
The [C1] constraint is well known, and limits the maximal rate. [C2]
seems obvious but is often "forgotten" in models. In rank coding
schemes, for instance [4], it is claimed that "all" spike time
permutations are significant, which is not realistic since many of
these permutations are indistinguishable because of the bounded
precision, as discussed in e.g. [?]. Similarly, a few concepts related
to "liquid states" [6] implicitly assume an unrealistic unbounded time
precision. [C3] is also obvious and avoids any causal paradox (e.g.
avalanche effect), but the induced simplifications are not always made
explicit.



The [C4] constraint is less obvious. The idea is that all neurons
have a "leak": in the absence of input, the potential decreases and
the neuron cannot fire anymore. In the brain, current observations
also show that a neuron is either firing or ... dead [?]. Now, [C4] is
easily violated by neural models with constant current input, able to
integrate during an unbounded period of time, or with maintained
sub-threshold oscillations. As discussed in detail in [?], whether
[C4] is verified or not completely changes the nature of the dynamics.
When [C4] is violated, "ghost orbits" occur: a "vicious" neuron can
remain silent for a very long period of time and then suddenly fire,
inducing a complete change in the non-linear system. We distinguish
situations with and without [C4] in the sequel.

Considering [C1-3] and possibly [C4], let us now review the related
consequences regarding modeling¹.

2 The maximal amount of information 

Considering [C1-2], given a network of $N$ spiking neurons observed
during a finite period $[0, D]$, the number of possible spikes is
obviously limited by the refractory period $r$. Furthermore, the
information contained in all spike times is strictly bounded, since two
spike occurrences in a $\delta t$ window are not distinguishable.

A rather simple reasoning [8] yields an upper bound for the amount of
information:

$N \, \frac{D}{r} \, \log_2\left(\frac{D}{\delta t}\right)$ bits during $D$ seconds.

Taking the numerical values into account, it means for large $D$, in
milliseconds, about $D \log_2(D)$ bits/neuron.
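
As a purely illustrative sketch, the bound above can be evaluated numerically. The function name `max_information_bits` and its argument names are hypothetical, and all durations are assumed to be expressed in the same unit.

```python
import math


def max_information_bits(n_neurons, duration, refractory, precision):
    """Evaluate the bound N * (D / r) * log2(D / delta_t), in bits.

    All three durations must use the same time unit (e.g. milliseconds).
    The names are illustrative and not taken from the original text.
    """
    if min(duration, refractory, precision) <= 0:
        raise ValueError("durations must be strictly positive")
    # At most D / r spikes per neuron, because of the refractory period.
    max_spikes = duration / refractory
    # Each spike time is one of D / delta_t distinguishable positions.
    bits_per_spike = math.log2(duration / precision)
    return n_neurons * max_spikes * bits_per_spike
```

For instance, `max_information_bits(1, 1000.0, 1.0, 0.1)` evaluates the bound for one neuron observed during one second, under the assumed values of a 1 ms refractory period and a 0.1 ms precision.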

In the particular case of fast-brain mechanisms where only "the first
spike matters" [9], this amount of information is not related to the
permutations between neuron spikes, i.e. of order
$O(\log(N!)) = N \log(N)$, but is simply proportional to $N$,
consistent with what is found in [?].

This bound is consistent with results presented in [1], considering
spike rate and using an information entropy measure. For instance,
considering a timing precision of 0.1-1 ms as derived here, the authors
obtain an information rate bounded around 500 bits/s for a neural
receptor.

Note that this is good news rather than bad news. For instance, in
statistical learning this corresponds to a coding with large margins,
thus as robust as support-vector machines, which may explain the
surprisingly impressive performance of fast-brain categorization [5].



¹ These temporal constraints also induce important consequences at the simulation level [8].



3 Dynamics of time-constrained networks 



4 Neural coding and temporal constraints 



A step further, taking [C1-3] into account allows one to "discretize"
the spike train sequences, i.e. to use "rasters"². The sampling period
$\Delta T$ is taken smaller than $r$, $\delta t$ and $dt$.

In simple models such as the basic leaky integrate-and-fire neuron, or
integrate-and-fire neuron models with conductance synapses and constant
input current, this discretization allows a full characterization of the
dynamics. It has been shown in these two cases that [?] [7]:

• [H1] The raster plot is generically periodic but, depending on
parameters such as external current or synaptic weights, periods can be
larger than any accessible computational time;

• [H2] There is a one-to-one correspondence between orbits and rasters
(i.e. raster plots provide a symbolic coding for the dynamics).

The fact [H1] allows one to clearly understand to what extent spike
trains can code information: periodic orbits give the code. The fact
[H2] means that, in these cases, the raster is a "symbolic coding" in
the sense that no information is lost by considering spike times and not
the whole neural state. Both facts also allow one to deeply understand
the network dynamics: Fig. 1 sketches out some aspects.




Figure 1. Describing the dynamic landscape of deterministic
time-constrained networks. [A] The phase space is partitioned into
bounded domains $B_i$, and for each initial condition in $B_i$ the
initial trajectory is attracted to an attractor, here not a fixed point
as in, e.g., Hopfield networks, but a periodic orbit $A_i$. [B] If the
parameters (input, weights) change, the landscape is modified and
several phenomena can occur: change in the shape of the basins, change
in the number of attractors, modification of an attractor, as for $A_3$
in this example. A point belonging to Aa in Fig. 1A can, after
modification of the parameters, converge either to attractor $A'_2$ or
$A_3$.

To what extent such a "canonical situation" extends to more complex
models is an open question, and we can easily conjecture that it does
not hold for, e.g., Hodgkin-Huxley [12] neuron assemblies. However, it
is at least true for a large class of state-of-the-art computational
models, shedding light on the kind of code they may generate.



² Formally, the discretized raster of a spike train writes, for $k \geq 0$:
$\omega_i[k] = \#\{t_i^n, \; k \, \Delta T \leq t_i^n < (k+1) \, \Delta T\} \in \{0, 1\}$
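
The discretization above can be sketched as follows for a single neuron. This is a minimal illustration, assuming spike times are given as a plain list of non-negative numbers; the names `raster`, `sampling_period` and `n_bins` are hypothetical.

```python
def raster(spike_times, sampling_period, n_bins):
    """Count the spikes falling in each bin [k * dT, (k + 1) * dT).

    Returns a list w with w[k] = number of spikes in bin k, for
    k = 0 .. n_bins - 1. Spikes outside the observed window are ignored.
    """
    if sampling_period <= 0:
        raise ValueError("sampling_period must be strictly positive")
    w = [0] * n_bins
    for t in spike_times:
        k = int(t // sampling_period)  # index of the bin containing t
        if 0 <= k < n_bins:
            w[k] += 1
    return w
```

When the sampling period is smaller than the refractory period, each bin holds at most one spike, so the counts reduce to the binary values of the definition.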



Let us now introduce the central idea of this review. 

As an illustrative example, let us consider the temporal order coding
scheme [4] [9] (i.e. rank coding): only the order of the events matters,
not their absolute time values. This means that two spike trains $T_1$,
$T_2$ with the same event ordering correspond to the same code. The key
point here is that rank coding defines a partition of the set of spike
trains, each spike train with the same ordering being in the same
equivalence class of the partition, i.e. corresponding to the same
"code". Choosing this code means we have structured spike trains using
an "equivalence relation" (i.e. $T_1$ and $T_2$ are equivalent if they
correspond to the same code).

The same view can be given for other codings: rate coding means that
all spike trains with the same frequencies are in the same equivalence
class, irrespective of their phase, etc.

Let us now introduce a "distance" $d(.)$, which is 0 if $T_1$ and $T_2$
correspond to the same code, and 1 otherwise.

The fact that spike precision is not unbounded leads to many
indistinguishable orderings. This does not change the rank coding
concept, but the partition is now coarser: trains that differ only in
the order of two spikes with indistinguishable occurrence times are in
the same equivalence class.
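
To make the partition view concrete, here is a minimal sketch of a rank code with bounded precision and of the associated 0/1 distance. It assumes one (first) spike time per neuron, and it models indistinguishability crudely by quantizing times into bins of width `precision`; both the quantization rule and the function names are assumptions made for illustration.

```python
def rank_code(first_spike_times, precision):
    """Return the firing order of the neurons as a tuple of groups.

    Neurons whose spike times fall in the same bin of width `precision`
    are considered indistinguishable and are placed in the same group.
    """
    groups = {}
    for neuron, t in enumerate(first_spike_times):
        groups.setdefault(int(t // precision), []).append(neuron)
    # Groups are ordered by time; neurons inside a group by index.
    return tuple(tuple(groups[k]) for k in sorted(groups))


def discrete_distance(times_a, times_b, precision):
    """0 if both trains map to the same rank code, 1 otherwise."""
    same = rank_code(times_a, precision) == rank_code(times_b, precision)
    return 0 if same else 1
```

With a coarse precision, two trains that swap nearly simultaneous spikes fall in the same equivalence class, whereas a finer precision separates them.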

A step further, how can we capture the fact that, e.g. for rank coding,
two spike times with a difference of "about" $\delta t$ are "almost"
indistinguishable? The natural idea is to use not discrete distances
(i.e. with binary 0/1 values) but a "quantitative" distance: two spike
trains correspond exactly to the same neural code if the distance is
zero, and the distance increases with the difference between the codes.

This is the idea we want to highlight here. This proposal is not a
mathematical "axiomatic", but a simple modeling choice. It is far from
being new but, surprisingly enough, has not been made explicit at this
level of simplicity. In order to see the interest of this idea, let us
briefly review the two main classes of spike train metrics.

As reviewed in detail in [3] [13], deterministic spike train metrics
can be grouped into three classes:

-0- "Binned" metrics, with spikes grouped into bins (e.g. rate coding
metrics), not discussed here.

-I- Convolution metrics, with a distance defined on some convolution of
the spike train, including raster-plot metrics.

-II- Spike time metrics, such as alignment distances [14].

5 Using convolution metrics 

Linear representation. A large class of metrics derives from the choice
of a convolution kernel $K$, writing:

$s_i(t) = \sum_{t_i^n \in T_i} K_i(t - t_i^n) = K_i * \rho_i \in [0, 1],$

easily normalized between 0 (no spike) and, say, 1 (burst mode at the
maximal frequency). The distance is then defined on the signal
$s = (\cdots, s_i, \cdots) \in \mathbb{R}^N$, e.g. using $L^p$ norms. The
"code" here corresponds to the linear representation metric. It allows
one to link spike trains with a quantitative signal $s$.
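
As an illustration of such a convolution metric, the sketch below samples the signal obtained with a causal exponential kernel and compares two spike trains with a discretized $L^2$ norm. The kernel choice, the absence of normalization, the sampling grid and all function names are assumptions made for this example, for a single neuron.

```python
import math


def exponential_trace(spike_times, tau, dt, duration):
    """Sample s(t) = sum_n K(t - t_n) on the grid t = 0, dt, 2 dt, ...

    K(u) = exp(-u / tau) for u >= 0 and 0 otherwise (causal kernel).
    The kernel is not normalized here.
    """
    n_samples = int(round(duration / dt))
    trace = []
    for k in range(n_samples):
        t = k * dt
        # Only spikes that already occurred contribute (causality).
        trace.append(sum(math.exp(-(t - tn) / tau)
                         for tn in spike_times if tn <= t))
    return trace


def l2_distance(trace_a, trace_b, dt):
    """Riemann-sum approximation of the L2 norm of the difference."""
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(trace_a, trace_b)) * dt)
```

With this choice the distance is zero for identical trains and, for a single shifted spike, it grows with the size of the shift, which is the "quantitative" behavior sought in the previous section.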
Figure 2. A few examples of spike train convolution: [A] The spike
train itself, [B] A causal local frequency measure estimation (writing
$\chi$ the indicatrix function), [C] A non-causal spike density,
uniformly equal to 1 in burst mode, [D] A normalized causal exponential
profile. Models of evoked post-synaptic potential profiles are nothing
but such causal convolutions (using e.g. double-exponential kernels to
capture the synaptic time-constant (weak delay) and potential decay).
Similarly, spike train representations using Fourier or Wavelet
transforms are intrinsically related to such convolutions.



The so-called "kernel methods" based on the Mercer theorem [3] are
directly linked to linear representations, since they are defined as
scalar products, writing:

$k(T, T') = \sum_i \sum_{n,m} \hat{K}_i(t_i^n - t_i'^m) = \int_t s(t) \cdot s'(t)$

with direct correspondences between usual kernels and linear
convolutions, e.g.:
This also includes distances based on inter-spike intervals, as
developed in e.g. [?].

Non-static kernels of the form $K_i(t, t - t_i^n)$ or $K_i^n(t - t_i^n)$
can also be used (clock-dependent coding, raster plot, 1st spike coding,
...), while non-linear Volterra series allow one to represent "higher
order" phenomena (see e.g. [10]). Weighted spike trains (i.e. with
quantitative values attached to each spike) are also direct
generalizations of these.

These linear representations not only provide tools to compare
different spike trains, but also allow one to better understand the link
between quantitative signals and spike times. For instance [?], writing
$s(t) = \sum_i \lambda_i s_i(t)$ allows one to define some network
readout to link spiking networks to "analog" sensory-motor tasks. Let us
illustrate this aspect by the following two results.



Kernel identification. Given a causal signal $s$ generated by a spike
train $T$, the problem of identifying the related kernel is formally
solved by the following program:

$\min_K \int_{t > 0} |s(t) - (K * \rho)(t)|^2 = \int_\lambda |K(\lambda) \, \rho(\lambda) - s(\lambda)|^2$

using the Laplace transform Parseval theorem, thus:

$K(\lambda) = [s(\lambda) \, \rho(\lambda)^\top] \, [\rho(\lambda) \, \rho(\lambda)^\top]^{-1}$

i.e. the spike train cross-correlation / auto-correlation ratio.
Non-causal estimation would consider the Fourier transform. This setting
corresponds to several identification methods [?] [?].

Signal reconstruction. In order to further understand the power of
representation of spike trains in this case, [16] has generalized the
well-known Shannon theorem as follows: a signal in the frequency range
$[-\Omega, \Omega]$ is entirely defined by irregular sampling values
$s_i^n$ at "spike times" $t_i^n$:

$s_i(t) = \sum_n s_i^n \, K_i^n(t - t_i^n)$

provided that $\max_n d_i^n < \pi / \Omega$, thus providing an explicit
signal "decoding".



Raster-plot metrics. A step further, it is easy to see that
representing the spike times by a "raster" corresponds to a non-static
convolution kernel. Spike trains can be represented as the real number
in $[0, 1[$ whose binary representation corresponds to the spike train,
inducing new metrics. A useful related metric is of the form, for
$\theta \in ]0, 1[$:

$d_\theta(\omega, \omega') = \theta^T, \quad T = \arg\max_t \{\omega[k] = \omega'[k], \; 0 \leq k \leq t\}$

thus capturing the fact that two rasters are equal up to a certain rank.
Such metrics are used to analyze the dynamics of spiking networks and
are typically used in the context of symbolic coding in dynamical
systems theory [?] [7].
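
A minimal sketch of such a raster metric is given below. The indexing convention is an assumption: the exponent is taken as the number of initial time steps on which both rasters agree, so that rasters differing at the very first step are at distance 1 and identical rasters are at distance 0. The function name is hypothetical.

```python
def raster_distance(raster_a, raster_b, theta):
    """Return theta ** k, where k is the first index at which the two
    rasters differ, or 0.0 if they are identical.

    Each raster is a sequence of per-time-step symbols, e.g. 0/1 values
    for one neuron or tuples of 0/1 values for a network.
    """
    if not 0 < theta < 1:
        raise ValueError("theta must lie strictly between 0 and 1")
    if len(raster_a) != len(raster_b):
        raise ValueError("rasters must have the same length")
    for k, (a, b) in enumerate(zip(raster_a, raster_b)):
        if a != b:
            return theta ** k
    return 0.0
```

The longer two rasters coincide from the initial time, the smaller their distance, which matches the symbolic-dynamics reading of "equal up to a certain rank".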



6 Using alignment metrics 

The original alignment metric. The second family of metrics we want to
review directly considers spike times.

The distance between two finite spike trains $T$, $T'$ is defined in
terms of the minimum cost of transforming one spike train into the
other. Two kinds of operations are defined:

- spike insertion or spike deletion, the cost of each operation being
set to 1;

- spike shift, the cost of shifting from $t_i^n \in T$ to
$t_i'^m \in T'$ being set to $|t_i^n - t_i'^m| / \tau$ for a time
constant $\tau$.

For small $\tau$, the distance approaches the number of non-coincident
spikes, since instead of shifting spikes it is cheaper to insert/delete
non-coincident spikes, the distance being always bounded by the number
of spikes in both trains.

For high $\tau$, the distance basically equals the difference in spike
number (rate distance), while for two spike trains with the same number
of spikes, there is always a time constant $\tau$ large enough such that
the distance is equal to $\sum_n |t_i^n - t_i'^n| / \tau$.

Here, two spike times are comparable if they occur within an interval of
$2 \tau$; otherwise they are better deleted/inserted.

Although computing such a distance seems subject to combinatorial
complexity, quadratic algorithms are available (i.e. with a complexity
equal to the product of the numbers of spikes). This is due to the fact
that, in a minimal path, each spike can be either deleted or shifted
once to coincide with a spike in the other spike train. Also, a spike
can be inserted only at a time that matches the occurrence of a spike in
the other spike train. This allows one to calculate the minimal distance
iteratively, considering the distance $d_{n,n'}(T, T')$ between a spike
train composed of the first $n$ spikes of $T$ and the first $n'$ spikes
of $T'$.
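
The iterative computation described above can be sketched as a dynamic program over the prefixes of both trains. This is a minimal sketch for a single unit, assuming finite spike trains given as lists of times and the usual edit-distance recurrence with unit insertion/deletion cost; the function name is hypothetical.

```python
def alignment_distance(train_a, train_b, tau):
    """Minimal cost of transforming train_a into train_b.

    Insertion and deletion cost 1; shifting a spike by dt costs
    abs(dt) / tau. Runs in time proportional to len(a) * len(b).
    """
    if tau <= 0:
        raise ValueError("tau must be strictly positive")
    a, b = sorted(train_a), sorted(train_b)
    n, m = len(a), len(b)
    # d[i][j]: distance between the first i spikes of a and first j of b.
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = float(i)  # delete the first i spikes of a
    for j in range(1, m + 1):
        d[0][j] = float(j)  # insert the first j spikes of b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            shift = abs(a[i - 1] - b[j - 1]) / tau
            d[i][j] = min(d[i - 1][j] + 1.0,        # delete a spike
                          d[i][j - 1] + 1.0,        # insert a spike
                          d[i - 1][j - 1] + shift)  # shift a spike
    return d[n][m]
```

Two spikes further apart than $2\tau$ are handled by a deletion and an insertion (cost 2) rather than by a shift, in agreement with the remark on comparable spike times.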
This modified metric illustrates how versatile this class of distances
is for representing the differences between spike trains, considering
temporal properties only.

Spike training. As a formal application, let us consider a neuron spike
response model [?] of the form:

$V_i(t) = \nu(t - t_i^{n-1}) + \sum_{j,m} w_{ij} \, \alpha(t - t_j^m), \quad t_i^{n-1} < t \leq t_i^n,$

the spike time being defined by $V_i(t_i^n) = \theta$, where $\theta$ is
the spiking threshold, thus as an implicit equation.

Previous metrics on spike times allow one to optimize the neural
weights in order to tune spike times, deriving, e.g., rules of the form:

where $T = \{\cdots, t_i^n, \cdots\}$ is the desired spike train.

Such optimization mechanisms are also applicable to time constants,
delays or thresholds. Using spike train metrics opens the door to the
formalization of such adaptation rules, in order to "compute with
spikes".



Figure 3. An example of minimal alignment from the upper to 
the lower spike train, using from top to bottom an insertion, a 
rightward shift, a leftward shift and a deletion respectively. 

When considering spike trains with more than one unit, one point of
view is to sum the distances for each unit-to-unit alignment. Another
point of view is to consider that a spike can "jump", with some cost,
from one unit in $T$ to another unit in $T'$. The related algorithmic
complexity is then no longer quadratic, but is raised to the power of
the number of units [?].

This family of metrics includes alignment not only on spike times, but
also on inter-spike intervals, as well as metrics sensitive to motifs of
spikes, etc. They have been fruitfully applied in a variety of neural
systems, to characterize neuronal variability and coding [13]. For
instance, for neurons that act as coincidence detectors with integration
time (or temporal resolution) $\tau$, spike trains will have similar
post-synaptic effects if they are similar for this metric.

A generalized alignment metric. Let us remark here that the previous
metric generalizes to metrics with:

- [causality] At a given time, the cost of aligning previous spikes
decreases with the obsolescence of the spike, say with an exponential
time constant $\tau'$.

- [non-linearity] The cost of a shift can be defined as very small, say
quadratic, when lower than the time precision, and then, say, linear in
the time difference.

This leads to an iterative definition of the distance $d_{n,n'}$ defined
previously: $d_{n,n'} =$
with, e.g., $\phi(d) = (d \, \tau / \delta t)^2 \wedge d$, again
implementable in quadratic time. It corresponds to the original
alignment metric iff $\phi()$ is the identity and $\tau' = +\infty$.



7 Discussion 

Although probabilistic measures of spike patterns, such as correlations
[?] or entropy-based pseudo-distances (e.g. mutual information), provide
a view of spike train variability which is enriched by the conceptual
framework of information theory, they may be difficult to estimate in
practice, since such measures are robust only if a large number of
samples is available. On the contrary, deterministic distances allow one
to characterize aspects of spike coding with efficient methods and
without this curse of sampling size.

This review highlights some of these methods and proposes to consider
that "choosing a coding" means "defining a metric", in the deterministic
case. It also makes explicit that spikes do not contain that much
information.